Multi-criteria Anomaly Detection using Pareto Depth Analysis
Abstract
We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. In most anomaly detection algorithms, the dissimilarity between data samples is calculated by a single criterion, such as Euclidean distance. However, in many cases no single dissimilarity measure captures all possible anomalous patterns. In such cases, multiple criteria can be defined, and one can test for anomalies by scalarizing the criteria with a linear combination. If the relative importance of the criteria is not known in advance, the algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we introduce a novel non-parametric multi-criteria anomaly detection method using Pareto depth analysis (PDA). PDA uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach scales linearly in the number of criteria and is provably better than linear combinations of the criteria.
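The notion of Pareto optimality that the abstract relies on can be illustrated with a minimal sketch of nondominated sorting, which assigns each point the index of the Pareto front it lies on. This is not the authors' implementation: the function names `dominates` and `pareto_depths` are hypothetical, the convention that smaller criterion values are better is an assumption, and the quadratic-time loop is chosen for readability rather than speed. How PDA turns such depths into an anomaly score is not described in the abstract and is not shown here.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every criterion and
    strictly better in at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_depths(points: List[Sequence[float]]) -> List[int]:
    """Assign each point the 1-based index of its Pareto front.

    Front 1 holds the points dominated by no other point; front 2 holds
    the points that become nondominated once front 1 is removed, etc.
    """
    depths = [0] * len(points)
    remaining = set(range(len(points)))
    depth = 0
    while remaining:
        depth += 1
        # Points in the current front are dominated by nothing still remaining.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        }
        for i in front:
            depths[i] = depth
        remaining -= front
    return depths


if __name__ == "__main__":
    # Each tuple is a vector of two criterion values (made-up numbers).
    sample = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    print(pareto_depths(sample))  # [1, 1, 1, 2, 3]
```

In the made-up example, the first three points trade off one criterion against the other and therefore share the first front, while (3, 3) and (5, 5) are dominated and fall on deeper fronts. No weights are chosen at any step, which is the contrast with linear scalarization that the abstract draws.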
Similar resources
Multi-criteria Anomaly Detection using Pareto Depth Analysis: Supplementary Material
1 Proofs of Theorems 1 and 2. Before presenting the proofs of Theorems 1 and 2, we need a preliminary result. Lemma 1. For any n ≥ 1 and measurable A ⊂ R^d, we have...
Anomaly detection and classification for streaming data using partial differential equations
Nondominated sorting, also called Pareto Depth Analysis (PDA), is widely used in multi-objective optimization and has recently found important applications in multicriteria anomaly detection. Recently, a partial differential equation (PDE) continuum limit was discovered for nondominated sorting, leading to a very fast approximate sorting algorithm called PDE-based ranking. We propose in this pa...
Detection of Mo geochemical anomaly in depth using a new scenario based on spectrum–area fractal analysis
Detection of deep and hidden mineralization using surface geochemical data is a challenging subject in mineral exploration. In this work, a novel scenario based on spectrum–area fractal analysis (SAFA) and principal component analysis (PCA) has been applied to distinguish and delineate the blind and deep Mo anomaly in the Dalli Cu–Au porphyry mineralization area. The Dalli miner...
A Data-Driven Framework for Visual Crowd Analysis
We present a novel approach for analyzing the quality of multi-agent crowd simulation algorithms. Our approach is data-driven, taking as input a set of user-defined metrics and reference training data, either synthetic or from video footage of real crowds. Given a simulation, we formulate the crowd analysis problem as an anomaly detection problem and exploit state-of-the-art outlier detection a...
Combining Disparate Information for Machine Learning
By Ko-Jen Hsiao. Chair: Alfred O. Hero. This thesis considers information fusion for four different types of machine learning problems: anomaly detection, information retrieval, collaborative filtering, and structure learning for time series, and focuses on a common theme: the benefit to combining disparate information resulting in improved alg...